WIP: ENH: igenerator pkl correctness fixes + CastXML content-addressed cache (supersedes #6486)#6533
Conversation
igenerator.py writes N pkl files per module as side effects that ninja cannot track because their names (ClassName.SubmoduleName.pkl) are not enumerable at CMake configure time. When pkl files are deleted while the .index.txt byproducts survive, ninja considers igenerator up-to-date and pyi_generator.py fails with "No pickle files were found." Add a --pkl_stamp argument to igenerator.py. The stamp is written after all pkl files for the module are complete and is declared as a CMake OUTPUT of the igenerator add_custom_command. Ninja now re-runs igenerator whenever the stamp is absent, which guarantees the pkl files are regenerated before pyi_generator.py reads the .index.txt manifests.
b98089d to
afbe436
Compare
|
| Filename | Overview |
|---|---|
| Wrapping/Generators/CastXML/itk-castxml-cache.py | New 636-line two-level content-addressed CastXML cache; logic is well-structured with correct atomic-rename store, gzip compression, multi-root cascade, and L1/L2 key hierarchy. LRU eviction via os.fork() is silently disabled on Windows with no fallback. |
| Wrapping/Generators/SwigInterface/igenerator.py | Replaces per-class .pkl files with a WAL-mode SQLite DB; adds stamp file output for correct ninja graph ordering. Default 5-second SQLite busy timeout may be too tight for 96 concurrent writers; manifest table is populated but never queried by pyi_generator.py. |
| Wrapping/Generators/Python/itk/pyi_generator.py | Migrated from file-path-based .pkl loading to SQLite key lookups; correctly reads index.txt for keys then queries pkl_data. DB opened without CREATE TABLE guard — an unexpected ITK_WRAP_CACHE path would give an obscure OperationalError. |
| Wrapping/macro_files/itk_end_wrap_module.cmake | Adds per-module stamp file as CMake OUTPUT and passes --module_name/--pkl_stamp to igenerator; correct fix for the ninja graph gap where stamp absence now forces igenerator to re-run. |
| Wrapping/macro_files/itk_auto_load_submodules.cmake | Correctly builds _castxml_cmd to prepend the Python cache wrapper when ITK_WRAP_CASTXML_CACHE is ON; appends cache script to the existing _castxml_depends list which is already referenced in the add_custom_command DEPENDS clause. |
| CMake/itkWrapCastXMLCacheSupport.cmake | New CMake module adding ITK_WRAP_CASTXML_CACHE option (ON by default) with validation of script path and Python3_EXECUTABLE; cleanly separated from the wrapping core. |
| .github/workflows/arm.yml | Adds ITK_WRAP_CACHE env var and cache/restore steps for the castxml cache; correctly gated on matrix.python-version presence. |
| Testing/ContinuousIntegration/AzurePipelinesLinuxPython.yml | Adds Cache@2 task for CastXML cache with per-SHA key and OS/pipeline fallback restore keys; ITK_WRAP_CASTXML_CACHE:BOOL=ON passed to cmake configure. |
| Testing/ContinuousIntegration/AzurePipelinesMacOSPython.yml | Mirrors the Linux pipeline changes for macOS; cache key correctly uses 'macOSPython' tag to isolate from Linux entries. |
| Testing/ContinuousIntegration/AzurePipelinesWindowsPython.yml | Same CastXML cache integration for Windows; ITK_WRAP_CACHE uses semicolon-separated path convention handled correctly by the Python scripts. |
| pyproject.toml | Adds ccache dependency to pixi feature.cxx and introduces configure-python-ci / build-python-ci / test-python-ci tasks with ITK_WRAP_CASTXML_CACHE enabled; clean separation from the existing configure-python task. |
Sequence Diagram
%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
participant CMake as CMake/Ninja
participant Cache as itk-castxml-cache.py
participant CastXML as CastXML binary
participant IGen as igenerator.py
participant DB as itk-pkl-v1.db SQLite WAL
participant PyiGen as pyi_generator.py
CMake->>Cache: invoke with cxx+inc args
alt L1 HIT and L2 HIT
Cache-->>CMake: restore output.xml.gz, no subprocess
else L1 miss, run castxml -E for L2 key
Cache->>CastXML: castxml -E preprocessor pass
CastXML-->>Cache: preprocessed output
alt L2 HIT
Cache-->>CMake: restore output.xml.gz, refresh L1 map
else L2 miss
Cache->>CastXML: full castxml run
CastXML-->>Cache: output.xml
Cache-->>CMake: store to L2, write L1 map
end
end
CMake->>IGen: igenerator.py per module, ~96 parallel
IGen->>DB: WAL transaction DELETE manifest then INSERT pkl_data and manifest
IGen-->>CMake: write .index.txt with DB keys, touch Module.stamp
CMake->>PyiGen: pyi_generator.py after stamp dependency
PyiGen->>DB: SELECT data FROM pkl_data WHERE key equals key
DB-->>PyiGen: pickled ITKClass bytes
PyiGen-->>CMake: write .pyi stubs
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
participant CMake as CMake/Ninja
participant Cache as itk-castxml-cache.py
participant CastXML as CastXML binary
participant IGen as igenerator.py
participant DB as itk-pkl-v1.db SQLite WAL
participant PyiGen as pyi_generator.py
CMake->>Cache: invoke with cxx+inc args
alt L1 HIT and L2 HIT
Cache-->>CMake: restore output.xml.gz, no subprocess
else L1 miss, run castxml -E for L2 key
Cache->>CastXML: castxml -E preprocessor pass
CastXML-->>Cache: preprocessed output
alt L2 HIT
Cache-->>CMake: restore output.xml.gz, refresh L1 map
else L2 miss
Cache->>CastXML: full castxml run
CastXML-->>Cache: output.xml
Cache-->>CMake: store to L2, write L1 map
end
end
CMake->>IGen: igenerator.py per module, ~96 parallel
IGen->>DB: WAL transaction DELETE manifest then INSERT pkl_data and manifest
IGen-->>CMake: write .index.txt with DB keys, touch Module.stamp
CMake->>PyiGen: pyi_generator.py after stamp dependency
PyiGen->>DB: SELECT data FROM pkl_data WHERE key equals key
DB-->>PyiGen: pickled ITKClass bytes
PyiGen-->>CMake: write .pyi stubs
Reviews (1): Last reviewed commit: "STYLE: Explicitly set ITK_WRAP_CASTXML_C..." | Re-trigger Greptile
|
Yes — this file is the core deliverable of the PR.
|
f3471d1 to
35fd9b6
Compare
igenerator.py now writes all pickle data for a module into a single WAL-mode SQLite database (itk-pkl-v3.db) kept in the build tree's ITK_PKL_DIR. The DB is keyed by bare class.submodule strings and is intermediate handoff state between igenerator (writer) and pyi_generator (reader), so it stays local to one build tree and is never shared via ITK_WRAP_CACHE. The .index.txt manifests now hold DB keys instead of file paths. pyi_generator.py reads keys from the manifests, queries the DB, and prunes any row whose key is absent from the current build's manifests (exact keyset cleanup, replacing the previous per-tree *.pkl glob). The schema version is embedded in the filename so a stale database from an older schema is ignored rather than migrated. itk_end_wrap_module.cmake passes --module_name to igenerator to identify the manifest partition; the per-module stamp file (.stamp) remains the CMake OUTPUT that ninja tracks. Assisted-by: Claude Code — design, implementation, and pre-commit validation
pygccxml is now imported lazily via _load_pygccxml() called in main() after the submodule_names_list is populated from the mdx files. The pygccxml_config is constructed immediately after, so the processing loop is unchanged. This separates the submodule-discovery phase (pure file I/O) from the pygccxml/CastXML-dependent phase, enabling a future cache-hit early return before pygccxml is ever loaded.
On 2-core CI runners, 816 CastXML invocations (~32 min) account for ~10% of a cold Python wrapping build. A warm cache reduces that to ~23 sec, cutting total build time by 32 min on Azure DevOps (19%) and 3h 12m on GHA when combined with a warm ccache (64%). itk-castxml-cache.py wraps the CastXML binary transparently: - L1 key: sha256 of cxx + castxml.inc + flags (no subprocess, ~0.2s) - L2 key: sha256 of castxml -E preprocessed output with line markers stripped, making keys path-independent across build directories Enable via ITK_WRAP_CASTXML_CACHE (default ON). Set ITK_WRAP_CACHE to override the cache root (default ~/.cache/itk-wrap). CI wiring: Cache@2 tasks added to Linux, macOS, and Windows Azure DevOps pipelines; arm.yml updated for GHA. pixi tasks added for configure-python-ci / build-python-ci / test-python-ci.
Relies on CMake default (ON) elsewhere; explicit here avoids silent no-op if the option is ever overridden upstream in the dashboard script.
35fd9b6 to
30c5eaa
Compare
|
@dzenanz I'm running this through some stress tests, and trying to simplify it a little bit (I've tried to squeeze small performance increases, but added more complexity for some of the added features). I'm trying to find where the ROI on complexity is worth it. So far big wins come from the castxml cache ( 30+ minutes saved on many CI runs), and preliminary indications are that using a simple SQL db for data store rather than 1000's of small files is going to be a win as well. This is not ready for review yet...Sorry. |
Fix two latent correctness bugs in the Python wrapping pipeline's pkl
tracking, then add a two-level content-addressed CastXML cache that
reduces CI build time by ~32 min on 2-core AzDO runners (−19%) and by
~3h 12m on GHA ubuntu-24.04 when combined with a warm ccache (−64%).
Supersedes #6486.
What this fixes (correctness)
Ninja graph gap —
igenerator.pywrites one.pklfile per wrappedclass (~816 files per full ITK build) into
itk-pkl/as side effects.None of those files appear as CMake
OUTPUTs; only the.index.txtmanifests do. When
itk-pkl/*.pklfiles are absent (fresh clone, partialninja -t clean, CI agent reusing a workspace with a stale build tree)while the
.index.txtfiles survive, ninja considersigenerator.pyup-to-date.
pyi_generator.pythen fails with:Fix (commit 1): a per-module
<Module>.stampfile is written byigenerator.pyafter all pkl work for that module completes and isdeclared as a CMake
OUTPUT. Ninja now rerunsigenerator.pywheneverthe stamp is absent, guaranteeing the pkl files exist before
pyi_generator.pyruns.Parallel write race — the ~96 concurrent
igenerator.pyjobs (one perITK module) all write individual
.pklfiles into the sameitk-pkl/directory with no locking. Under adverse scheduling this produces torn
reads or silently truncated files.
Fix (commit 2): replace the directory of per-class
.pklfiles with asingle WAL-mode SQLite database (
itk-pkl-v1.db). SQLite WAL mode isdesigned for concurrent writers; the race is structurally eliminated.
The
.index.txtmanifests now hold DB keys (not file paths), andpyi_generator.pyqueries the DB via those keys.What this adds (performance)
Commit 3 defers
import pygccxmluntil after the submodule list isbuilt from the
.mdxfiles. This separates the pure-I/O discovery phasefrom the pygccxml/CastXML-dependent processing phase, laying the
groundwork for a future early-return when all submodule data is already
cached.
Commit 4 adds
itk-castxml-cache.py, a transparent wrapper for theCastXML binary with a two-level content-addressed cache:
.cxxcontent +.castxml.incflags → L2 key. Survives
ninja -t cleanbecause the CastXML binary isfingerprinted by content, not mtime.
castxml -Eoutput withline markers stripped →
output.xml.gz. Path-independent: any builddirectory on the same source tree produces the same L2 key.
Enable via
ITK_WRAP_CASTXML_CACHE(defaultON). SetITK_WRAP_CACHEto override the cache root (default
~/.cache/itk-wrap).CI wiring added:
Cache@2tasks in all three Azure DevOps Pythonpipelines and GHA
arm.yml;configure-python-ci/build-python-ci/test-python-cipixi tasks added.Measured CI performance (PR #6486 runs, GHA + AzDO)
All measurements taken from the
ci/linux-azure-disk-managementWIPbranch (same implementation). The clean branch (
enh/igenerator-sqlite-pkl)produces identical build artifacts.
GHA ubuntu-24.04 (2-core runner, PR #6486):
Cold-to-warm: both the CastXML cache and ccache were cold on the first
run and warm on the second; the 64% total reduction is their combined effect.
AzDO ITK.Linux.Python (2-core runner) — CastXML cache in isolation:
Both the cold run (buildId=16132) and the warm run (buildId=16133) had
ccache already warm (restored from prior branch builds in 22–36 s). This
isolates the CastXML cache contribution:
CastXML cache alone saves ~32 min per AzDO Python CI run.
72-core local workstation (gyrus, NVMe SSD, ccache warm):
On a many-core machine CastXML parallelizes across all cores and is not
on the critical path. The cache overhead is <30 s on a cold seed pass and
the warm speedup is ~19 s (4%). The cache is most valuable on CI runners
where CastXML is core-count-limited.
Why this supersedes #6486
PR #6486 (
ci/linux-azure-disk-management) was a rambling work-in-progressused to accumulate CI benchmark data. It contains:
.github/workflows/python.yml)This PR (
enh/igenerator-sqlite-pkl) is a clean rewrite fromupstream/mainwith the same final implementation in five independently-reviewable commits.
The igenerator LRU cache removal commit is absent because that code never
existed on
upstream/main; only the net-positive changes are present.